DETAILED ACTION 
Response to Amendment 
1. In response to the Office Action mailed Sept. 12, 2005, Applicants have submitted an 
Amendment, filed Feb. 13, 2006, canceling claims 1-12 and 17, amending claims 13, 18-21, 23, 
and 28-29, without adding new matter, and arguing to traverse the claim rejections. 



Specification 

2. Amendments to the specification have been made to correct certain informalities. Thus, 
the objections have been withdrawn. 



Information Disclosure Statement 

3. The information disclosure statement (IDS) submitted on Feb. 13, 2006 was filed after 
the mailing date of the Office Action on Sept. 12, 2005. The submission is in compliance with 
the provisions of 37 CFR 1.97. Accordingly, the information disclosure statement has been 
considered by the examiner. 



Response to Arguments 

4. Applicant's arguments with respect to claims 13, 18-21, 23, 28, and 29, have been 
considered but are moot in view of the new ground(s) of rejection, next. 



5. Regarding claim 13, Applicants argue that "Russell fails to teach or suggest biasing the 
transition probabilities in dependence on an estimated number of phonetic segments in an 
utterance as claimed in independent [claim 13]" (Remarks, p. 7, ll. 8-10). However, claim 13 
still recites, "means for biasing the transition probabilities in dependence on the length of the 
utterance", not the number of phonetic segments in an utterance. 

6. Regarding claim 27, Applicant's arguments have been fully considered but they are not 
persuasive, for the following reasons: 

Claim 27 recites "a phonetic segment estimator arranged to output an estimate of the 
number of phonetic segments in the utterance; and a processing module for applying a transition 
bias to the transition probability in response to the output of the estimator." Applicants argue 
that "Russell fails to teach or suggest biasing the transition probabilities in dependence on an 
estimated number of phonetic segments in an utterance as claimed in independent [claim 27]" 
(Remarks, p. 7, ll. 8-10). However, the Examiner maintains that when Russell estimates the 
number of phones-per-second, it inherently teaches estimating the number of phonetic segments 
in the utterance. 

Moreover, on p. 1, col. 2, line 11 of "Experimental results", Russell teaches "K, for each 
occurrence of a phone symbol in the test set [number of phonetic segments in the utterance]"; 
Fig. 3 shows the correlation pk for PRROS estimation window sizes K between 1 and 20; "the 
identities and endpoints of p1, ..., pk ... can be estimated during recognition through partial 
traceback, and used to adapt the self-transition probabilities according to eqn. 2 throughout an 
utterance" (p. 2, col. 1), suggesting biasing the transition probabilities in dependence on the 
number of phonetic segments in the utterance. 
7. The text of those sections of Title 35, U.S. Code not included in this action can be found 
in a prior Office action. 

Claim Rejections - 35 USC §102 

8. Claims 13, 14, 18, 21, and 25-29 are rejected under 35 U.S.C. 102(b) as being clearly 
anticipated by Russell et al. ("Measure of local speaking-rate for automatic speech recognition," 
published May 13, 1999). 

Regarding claim 13, Russell et al. teach a speech recognition system in which an utterance 
to be recognized is represented as a sequence of phonetic segment models (see abstract, discussing 
"phone-level" speaking and estimation) in which a transition probability represents the probability of 
the occurrence of a transition between the models (see lines 4-5, "N-state HMM... transition 
probability" under "ROS compensation"), comprising means (a speech recognizer) for: 

estimating the number of phonetic segments in the utterance to be recognized (see lines 1-2 
under "Phone-level measures of ROS" describing a measure of "phones-per-second" (or phonetic 
segments) in a sentence (synonymous with an utterance); Russell estimates the number of 
phones-per-second, which inherently teaches estimating the number of phonetic segments in the 
utterance); and 

biasing the transition probabilities in dependence on the length of the utterance (see lines 
9-10 under "ROS compensation," which discuss the state transition probabilities "scaled for fast 
speech," implying dependence on length). 

Regarding claim 29, Russell et al. teach a method of speech recognition in which... transition 
between models, the method comprising biasing the transition probabilities in dependence on the 
number of phonetic segments in the utterance (see lines 1-2 under "Phone-level measures of ROS" 
describing a measure of "phones-per-second" (or phonetic segments) in a sentence (synonymous 
with an utterance); while Russell estimates the number of phones-per-second, it inherently 
teaches estimating the number of phonetic segments in the utterance; see also p. 1, col. 2, line 11 
of "Experimental results", which teaches "K, for each occurrence of a phone symbol in the test set 
[number of phonetic segments in the utterance]"; Fig. 3 shows the correlation for PRROS 
estimation window sizes K between 1 and 20; "the identities and endpoints of p1, ..., pk ... can be 
estimated during recognition through partial traceback, and used to adapt the self-transition 
probabilities according to eqn. 2 throughout an utterance" (p. 2, col. 1), suggesting biasing the 
transition probabilities in dependence on the number of phonetic segments in the utterance). 

Regarding claim 14, Russell et al. teach wherein the biasing means comprise means for 
applying a transition bias to each of the transition probabilities between a plurality of phonetic 
segment models (see lines 18-21 under "ROS compensation"). 

Regarding claim 18, Russell et al. teach wherein the estimating means comprises a speaker 
specific rate of speech estimator (see Abstract). 

Regarding claim 21, Russell et al. teach wherein the transition bias is set in response to the 
result of the estimating means (see lines 6-10 under "ROS compensation," which discuss a rate of 
speech compensation which scales (or biases) the state transition probabilities according to the 
speaker specific rate of speech). 

Regarding claim 25, Russell et al. teach wherein the, or each, phonetic segment comprises a 
phoneme (see lines 1-2 under "Phone-level measures of ROS" describing "phone-level" measures 
wherein a "phone" is a sound unit of speech also known as a phoneme, or allophone, which is a 
predictable phonetic variant of a phoneme). 

Application/Control Number: 10/020,895 
Art Unit: 2626 

Regarding claim 26, Russell et al. teach a system wherein the, or each, utterance comprises a 
word (see line 3 under "Phone-level measures of ROS" describing phones "in a sentence," wherein a 
spoken sentence is a string of uttered words). 

Regarding claim 27, Russell et al. teach wherein an utterance to be recognized is represented 
as a sequence of phonetic segment models in which a transition probability represents the probability 
of occurrence of a transition between the models (see lines 1-5, "N-state HMM... transition 
probability" under "ROS compensation"), comprising: 

a phonetic segment estimator arranged to output an estimate of the number of phonetic 
segments in the utterance (see lines 1-2 under "Phone-level measures of ROS," wherein the utterance 
is a sentence; while Russell estimates the number of phones-per-second, it inherently teaches 
estimating the number of phonetic segments in the utterance); and 

a processing module for applying a transition bias to the transition probability in response to 
the output of the estimator (see lines 6-10 under "ROS compensation," which discuss a rate of speech 
compensation which scales (or biases) the state transition probabilities according to the speaker 
specific rate of speech). 

Moreover, on p. 1, col. 2, line 11 of "Experimental results", Russell teaches "K, for each 
occurrence of a phone symbol in the test set [number of phonetic segments in the utterance]"; 
Fig. 3 shows the correlation pk for PRROS estimation window sizes K between 1 and 20; "the 
identities and endpoints of p1, ..., pk ... can be estimated during recognition through partial traceback, 
and used to adapt the self-transition probabilities according to eqn. 2 throughout an utterance" (p. 2, 
col. 1), suggesting biasing the transition probabilities in dependence on the number of phonetic 
segments in the utterance. 
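
For illustration only, the chain of reasoning relied on above (a phones-per-second rate multiplied by the utterance duration implies a phone count, and the rate then drives an adjustment of the self-transition probabilities) can be sketched as follows. This is a minimal sketch and is not Russell's eqn. 2, which is not reproduced in this action. The function names, the reference rate, and the duration-scaling rule are hypothetical assumptions chosen only to make the idea concrete.

```python
def estimate_segment_count(phones_per_second, duration_s):
    """Phone count implied by a rate-of-speech estimate (rate x duration)."""
    return phones_per_second * duration_s


def bias_self_transition(a_self, rate, reference_rate):
    """Return (self_loop, exit) probabilities for one HMM state after biasing.

    Hypothetical rule (an assumption, not taken from Russell): the expected
    state duration 1 / (1 - a_self) is multiplied by reference_rate / rate,
    so speech faster than the reference shortens the expected stay in a state.
    """
    if not 0.0 <= a_self < 1.0:
        raise ValueError("a_self must be in [0, 1)")
    if rate <= 0.0 or reference_rate <= 0.0:
        raise ValueError("rates must be positive")
    expected_duration = 1.0 / (1.0 - a_self)
    # A state is occupied for at least one frame, so clamp at 1.
    scaled = max(1.0, expected_duration * reference_rate / rate)
    new_self = 1.0 - 1.0 / scaled
    return new_self, 1.0 - new_self
```

Under these assumptions, a state with self-transition probability 0.8 at the reference rate would receive a self-transition probability of about 0.6 when the measured rate is twice the reference rate.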

Regarding claim 28, Russell et al. teach a portable communications device including a speech 
recognition system (see line 16 under "experimental procedure," describing the use of a "DERA 
ASTREC speech recognizer," which is a state-of-the-art reconfigurable continuous automatic speech 
engine (or system) from The Defense Evaluation and Research Agency, which is suitable for 
deployment in command-and control direct voice input applications in a wide range of existing 
commercial markets (e.g. automotive, telephone-based IVR systems, TV control, etc.) and has 
already been trialed in a range of applications (e.g. European Fighter Aircraft), which reads on 
implementation in portable communication devices). 



Claim Rejections - 35 USC § 103 
9. Claim 19 is rejected under 35 U.S.C. 103(a) as being unpatentable over Russell et al. in view of 
James et al. ("A Fast Lattice-Based Approach to Vocabulary Independent Wordspotting," ICASSP 
1994, pp. 377-380). 

Regarding claim 19, Russell et al. fail to teach a system wherein the estimating means 
comprises a Free Order Viterbi decoder. However, Viterbi decoders are well known in the field of 
speech recognition as evidenced by James et al., which disclose implementing a Free-Order Viterbi 
decoder (a null-grammar phone network, see page 1-379, lines 14-15 of section 3.3). 

It would have been obvious to one of ordinary skill in the art at the time the invention was 
made to modify the teaching elements of Russell et al. with those of James et al., because James et al. 
teach that this would increase flexibility by being able to search for any word and speed of retrieval 
(see page 1-377, sixth paragraph, lines 1-5; see also US Patent 6,073,095 to Dharanipragada et al. 
which references this publication in the "Prior Art" section of column 1). 

10. Claim 20 is rejected under 35 U.S.C. 103(a) as being unpatentable over Russell et al. in view 
of Bergstrom et al., US Patent No. 5,737,716 (filed Dec. 26, 1995). 

Regarding claim 20, Russell et al. fail to teach a system wherein the estimating means 
comprises a neural network classifier. However, this feature is well known in the art as evidenced by 
Bergstrom et al., which disclose a neural network controlled speech analysis processor that includes a 
neural network which manages speech characterization, encoding, decoding, and reconstruction 
methodologies, reading on a neural network classifier (see abstract). 

It would have been obvious to one of ordinary skill in the art at the time the invention was 
made to modify the teaching elements of Russell et al. with those of Bergstrom et al., because Bergstrom et 
al. teach that this would "provide for rapid development, improved classification accuracy, improved 
speech analysis and speech synthesis architectures, and improved immunity to interference when 
trained with appropriate characteristic features" (see column 3, lines 15-19). 

11. Claims 15, 16, and 30 are rejected under 35 U.S.C. 103(a) as being unpatentable over Russell 
et al. as applied to claims 14 and 29, above, in view of Gupta et al. (US Patent No. 5,390,278). 

Regarding claims 15 and 16, Russell et al. fail to teach a system operable to recognize 
utterances from a recognition vocabulary, wherein the transition bias is calculated as the transition 
bias which maximizes recognition performance on a validation data set which represents, or has the 
same vocabulary as, the recognition vocabulary. 

However, this procedure would have been obvious to one of ordinary skill in the art at the 
time the invention was made given the invention by Gupta et al. Gupta et al. teach transition 
probabilities calculated, with "the one resulting in the best score" stored (see column 17, lines 48-49), 
suggesting choosing a transition bias which maximizes recognition performance, and a validation 
data set representing, or having the same vocabulary as, the recognition vocabulary (see column 12, 
lines 45-49 and column 14, lines 21-23). 

Regarding claim 30, Russell et al. fail to teach a method comprising decoding the sequence of 
phonetic segment models after application of the transition bias. 

However, this procedure would have been obvious to one of ordinary skill in the art at the 
time the invention was made given the invention by Gupta et al. Gupta et al. suggest decoding the 
sequence of phonetic segment models after applying a bias (see Abstract and column 18, first 
paragraph; decoding is done by the A* search method as illustrated in Fig. 12a, element 418). 
Motivation for the combination would be to save the unnecessary decoding before the application of 
the transition bias, wherein the transition bias improves recognition. 

12. Claim 31 is rejected under 35 U.S.C. 103(a) as being unpatentable over Russell et al. as 
applied to claim 29, above, in view of Gupta et al. (US Patent No. 6,138,095). 

Regarding claim 31, Russell et al. fail to teach a method comprising decoding the sequence of 
phonetic segment models without the application of transition bias (as specified in the rejection of 
claim 14, Russell et al. teach only a transition bias) and normalizing the resulting scores by a 
contribution proportional to the transition bias. 

However, this procedure would have been obvious to one of ordinary skill in the art at the 
time the invention was made given the invention by Gupta et al. See column 3, lines 9-24 and 
column 3, line 66 through column 4, line 2 of Gupta et al., which disclose normalizing rejection 
thresholds and likelihood ratios (similar to resulting scores) by the magnitude of a null hypothesis 
probability (similar to transition probabilities). Motivation for the combination would be to simplify 
processing, in the case where the transition biases are too large, too small, or not integral numbers. 
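
As a hedged illustration of the claim 31 limitation discussed above, one possible reading is that, in the log domain, a constant per-transition bias contributes an amount proportional to the number of transitions on a path, so a score decoded without the bias can be adjusted afterwards. The sketch below assumes that reading. The function names and the additive log-domain bias are hypothetical and are not taken from the application or from Gupta et al.

```python
import math


def path_log_score(transition_log_probs):
    """Log score of one decoded path: the sum of its transition log-probabilities."""
    return sum(transition_log_probs)


def normalize_score(unbiased_log_score, num_transitions, log_bias):
    """Adjust an unbiased path score by a contribution proportional to the bias.

    Assumes the same additive log-domain bias would have been applied to
    every transition on the path (a hypothetical simplification).
    """
    return unbiased_log_score + num_transitions * log_bias


# Example path with two transitions and a per-transition bias factor of 2.
log_probs = [math.log(0.5), math.log(0.25)]
log_bias = math.log(2.0)

# Decoding with the bias applied to each transition ...
biased_score = path_log_score([lp + log_bias for lp in log_probs])
# ... matches decoding without it and normalizing afterwards, for a fixed path.
normalized_score = normalize_score(path_log_score(log_probs), len(log_probs), log_bias)
```

The equivalence holds only for a fixed path. In a full decoder the bias can change which path scores best, so the sketch says nothing about whether the two procedures select the same hypothesis.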

13. Claim 32 is rejected under 35 U.S.C. 103(a) as being unpatentable over Russell and Gupta et 
al. (US Patent No. 6,138,095), as applied to claim 31 above, further in view of Ueyama et al. (US 
Patent Application Publication 2001/0056346 A1). 

Regarding claim 32, Russell et al. fail to teach a method comprising calculating the transition 
bias in parallel with the decoding of the sequence of phonetic segment models. 

However, this procedure is well known in the art as evidenced by Ueyama et al., which 
disclose computing the output probabilities (synonymous with a transition probability) of acoustic 
models in parallel with decoding of speech parameters (synonymous with a sequence of phonetic 
segment models). See paragraph [0095]. Motivation for the combination would be to save time. 

14. Claims 22-24 are rejected under 35 U.S.C. 103(a) as being unpatentable over Russell et al. as 
applied to claim 21, above, in view of Schwartz et al. (US Patent No. 5,621,859), and further in view 
of Gupta et al. (US Patent No. 6,138,095). 

Regarding claims 22-24, Russell et al. fail to teach a system comprising table look-up means 
for setting the transition bias in accordance with the number of phonetic segments in the utterance, 
and direct setting means for setting the transition bias as proportional or equal to the number of 
phonetic segments in the utterance. 

However, a system comprising "table look-up means for setting the transition bias" is well 
known in the art as evidenced by Schwartz et al., which disclose a lookup-table where transition 
probabilities are stored for each transition from each grammar state to each possible following word 
(see column 15, lines 15-18 and 27-29; see also Figure 8). Motivation for the combination would be 
to reduce the amount of computation done by the system by storing transition probabilities already 
calculated. 

Both Russell and Schwartz et al. fail to teach setting the transition bias in accordance with, or 
proportional to, the number of phonetic segments in the utterance. 

However, setting the transition bias in accordance with, or proportional to, the number of phonetic 
segments in the utterance would have been obvious to one of ordinary skill in the art given the 
invention by Gupta et al. Gupta et al. disclose that the rejection performance of speech recognition can 
be improved if a different rejection threshold is selected for each utterance length (see column 3, 
lines 46-48), which is synonymous with the idea of setting different transition biases that are 
utterance-length dependent or proportionally dependent, which includes setting the bias equal to the length. 
Gupta et al. teach that this would improve recognition performance for different utterance lengths 
(see column 1, line 58, through column 2, line 3). 
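
To make the two alternatives at issue in claims 22-24 concrete, the following minimal sketch contrasts a table look-up with a direct (proportional or equal) setting of the transition bias from the number of phonetic segments. The table entries, the default value, and the function names are hypothetical assumptions for illustration. They are not taken from the application, Schwartz et al., or Gupta et al.

```python
import bisect

# Hypothetical look-up table: (upper bound on segment count, transition bias).
BIAS_TABLE = [(5, 0.0), (10, 0.5), (20, 1.0)]
DEFAULT_BIAS = 1.5  # assumed value for utterances longer than the last bound


def bias_from_table(num_segments):
    """Table look-up: return the bias of the first row whose bound covers the count."""
    bounds = [bound for bound, _ in BIAS_TABLE]
    i = bisect.bisect_left(bounds, num_segments)
    return BIAS_TABLE[i][1] if i < len(BIAS_TABLE) else DEFAULT_BIAS


def bias_direct(num_segments, k=1.0):
    """Direct setting: bias proportional to the segment count (equal when k == 1)."""
    return k * num_segments
```

With these assumed values, an utterance of 8 segments would receive a bias of 0.5 from the table, and a bias of 8.0 from the direct setting with k equal to 1.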



Conclusion 

15. Applicant's amendment necessitated the new ground(s) of rejection presented in this 
Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). 
Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a). 

A shortened statutory period for reply to this final action is set to expire THREE 
MONTHS from the mailing date of this action. In the event a first reply is filed within TWO 
MONTHS of the mailing date of this final action and the advisory action is not mailed until after 
the end of the THREE-MONTH shortened statutory period, then the shortened statutory period 
will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 
CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, 
however, will the statutory period for reply expire later than SIX MONTHS from the date of this 
final action. 

16. Any inquiry concerning this communication or earlier communications from the 
examiner should be directed to Eunice Ng whose telephone number is 571-272-2854. The 
examiner can normally be reached on Monday through Friday, 8:30 a.m. - 5:00 p.m. 

If attempts to reach the examiner by telephone are unsuccessful, the examiner's 
supervisor, David Hudspeth, can be reached on 571-272-7843. The fax phone number for the 
organization where this application or proceeding is assigned is 571-273-8300. 

Information regarding the status of an application may be obtained from the Patent 
Application Information Retrieval (PAIR) system. Status information for published applications 
may be obtained from either Private PAIR or Public PAIR. Status information for unpublished 
applications is available through Private PAIR only. For more information about the PAIR 
system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR 
system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). 

17. Please note the recent change in art unit designation from 2654 to 2626. 

March 24, 2006 